Method and apparatus for predicting failure in a system

ABSTRACT

The invention concerns an apparatus and method for predicting system reliability or failure that incorporate known information about component failure into a system model and use that model, with or without other acquired system data, to predict the probability of system failure. An embodiment of the method includes using probabilistic methods to create a system failure model from the failure models of individual system components, predicting failure of the system based on the component models and system data, ranking the sensitivity of the system to the system variables, and communicating a failure prediction.

This application claims priority pursuant to 35 U.S.C. § 119(e)(1) to provisional application 60/260,449, filed Jan. 8, 2001.

FIELD OF THE INVENTION

This invention relates to a method and apparatus for predicting failure of a system. More specifically, it relates to a method and apparatus for integrating data measured from a system, and/or data referenced from other sources, with component failure models to predict component or overall system failure.

BACKGROUND OF THE INVENTION

Any product will eventually fail, regardless of how well it is engineered. Often failure can be attributed to structural, material, or manufacturing defects, even for electronic products. A failure at the component or sub-component level often results in failure of the overall system. For example, cracking of a piston rod can result in failure of a car, and loss of a solder joint can result in failure of an electronic component. Such failures present safety or maintenance concerns and often result in loss of market share.

A way to predict the impending failure of a system or component would be useful to allow operators to repair or retire the component or system before the actual failure, and thus avoid the negative consequences associated with an actual failure.

Accurate prediction of impending structural, mechanical, or system failure could have great economic impact on industries within the aerospace, automotive, electronics, medical device, appliance, and related sectors.

Engineers currently attempt to design products for high reliability, but reliability information most often arrives very late in the design process. Often a statistically significant amount of reliability data is not obtained until after product launch, through warranty claims from use by consumers. This lack of data makes it common for engineers to add robustness to their designs by using safety factors to ensure that a design meets reliability goals.

Safety factors, however, are subjective in nature and usually based on historical use. Since modern manufacturers are incorporating new technology and manufacturing methods faster than ever before, exactly what safety factor is appropriate to today's complex, state-of-the-art product is seldom, if ever, known with certainty. This complicates the engineering process. In addition, safety factors tend to add material or structural components, or add complexity to the manufacturing process. They are counterproductive where industry is attempting to cut cost or reduce weight. Designing cost-effective and highly reliable structures therefore requires the ability to reduce the safety factor as much as possible for a given design.

In attempting to reduce reliance on safety factors, designers have, over the years, developed models for the damage mechanisms that lead to failures. Failures can be attributed to many different kinds of damage mechanisms, such as fatigue, buckling, and corrosion. These models are used during the design process, usually through deterministic analysis, to identify feasible design concept alternatives. But poor or less-than-desired reliability is often attributed to variability, and deterministic analysis fails to account for variability.

Variability affects product reliability through any number of factors, including loading scenarios, environmental condition changes, usage patterns, and maintenance habits. Even a system's response to a steady input can exhibit variability; consider a steady-flow pipe with varying degrees of corrosion.

Historically, testing has been the means for evaluating the effects of variability. Unfortunately, testing is a slow, expensive process, and evaluation of every possible source of variability is not practical.

Over the years, probabilistic techniques have been developed for predicting variability and have been coupled with damage models of failure mechanisms to provide probabilistic damage models that predict the reliability of a population. But, given variability, a prediction of the reliability of a population says little about the future life of an individual member of that population. Safety factors are likewise unsatisfactory for predicting the life of an individual, since they are based on historical information obtained from a population. They are also unsatisfactory for quickly and efficiently designing against failure, since they rely on historical information obtained from test and component data. As a result, there exists a need for a method and apparatus for accurately predicting component and/or system failure that accounts for variability without the need for extensive test data on the component and/or system.

SUMMARY OF THE INVENTION

The present invention is a method and apparatus for predicting system failure, or system reliability, using a computer-implemented model of the system. In an embodiment of the invention, that model relies upon probabilistic analysis. Probabilistic analysis can incorporate any number of known failure mechanisms for an individual component, or components, of a system into one model, and from that model can determine the critical variables upon which to base predictions of system failure. Failure can result from a number of mechanisms or combination of mechanisms. A probabilistic model of the system can nest failure mechanisms within failure mechanisms, or tie failure mechanisms to other failure mechanisms, as determined appropriate from analysis of the inter-relationships between both the individual failure mechanisms and individual components. This results in a model that accounts for various failure mechanisms, including fatigue, loading, age, temperature, and other variables as determined necessary to describe the system. As a result of probabilistic analysis, the variables that describe the system can also be ranked according to the effect they have on the system.

Probabilistic analysis of a system predicts system and/or component failure, or reliability, based on acquired data in conjunction with data obtained from references and data inferred from the acquired data. This prediction of failure or reliability is then communicated to those using or monitoring the system. Furthermore, the analyzed system can be stationary or mobile, with the method or apparatus of analysis and communication of the failure prediction being performed either on the system or remotely from the system. In addition, the apparatus may interface with other computer systems, with these other computer systems supplying the required data or deciding whether and/or how to communicate a prediction.

An advantage of one embodiment of the invention is that it divides system variables into three types: directly sensed, those that change during operation or product use; referred, those that do not (significantly) change during operation or product use; and inferred, those that change during operation or use but are not directly sensed. This strategy divides the probabilistic approach into two broad categories, pre-process off-board analysis and near-real-time on-board or off-board analysis, allowing for prediction of a probability of failure based on immediate and historic use.

In one embodiment of the invention, a computer implements a method for predicting failure in a system. This method comprises: measuring data associated with a system; creating a prediction of a failure of the system using a model of the system and the data; and communicating the prediction to a user or operator.

A second embodiment of the invention is an apparatus for predicting failure of a system. This apparatus comprises: sensors for acquiring data from the system; and a computer, with the computer having a processor and memory. Within the memory are instructions for measuring the data from the sensors; instructions for creating a prediction of a failure of the system using a model and the data; and instructions for communicating the prediction. The apparatus also comprises communication means for communicating the prediction.

A third embodiment of the invention is a computer program product for predicting failure of a system, for use in conjunction with a computer system. The computer program product comprises a computer readable storage medium and a computer program mechanism embedded therein. The computer program mechanism comprises: instructions for receiving data; instructions for storing the data; instructions for creating a prediction of failure of the system using a model and the data; and instructions for communicating this prediction. Furthermore, embodiments of these apparatuses and this method use a system model developed with probabilistic methods.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other aspects and advantages of the present invention will be better understood from the following detailed description of preferred embodiments of the invention with reference to the drawings, in which:

FIG. 1 is a schematic illustrating an embodiment of an apparatus of the present invention employed on a dynamic system and an indication of the process flow;

FIGS. 2(a)-(d) illustrate a preferred embodiment of the off-board engineering portion of an embodiment of a method of the present invention;

FIGS. 3(a) and (b) illustrate an embodiment of the on-board failure prediction portion of the method also depicted in FIGS. 2(a)-(d);

FIG. 4 illustrates an embodiment of the invention employed in a static system; and

FIGS. 5(a)-5(f) illustrate an example of the method of FIGS. 1, 2, and 3 applied to a composite helicopter rotor hub.

Like reference numerals refer to corresponding elements throughout the several drawings.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the present invention uses sensed data combined with probabilistic engineering analysis models to provide a more accurate method for predicting the probability of failure of a component or a system. This embodiment uses probabilistic analysis models to address, on a component-by-component basis, the effects of the random nature associated with use, loading, material makeup, environmental conditions, and manufacturing differences. This embodiment assumes that the underlying physics of the system behavior is deterministic and that the random nature of the system response is attributable to the scatter (variability) in the input to the system and in the parameters defining the failure physics.

The underlying physics of the system behavior is captured by developing a system response model. This model, which represents the nominal response of the system, uses random variables as input parameters to represent the random system behavior. The system response model may be based on the explicit mathematical formulas of mechanics of materials, thermodynamics, etc. Computational methods, such as finite element analysis and computational fluid analysis, are sometimes used to assess the response of the system. Closely coupled with the system response models are failure models. The failure models, which address both initial and progressive damage, may be either in the form of maximum load interactive criteria or more specific models developed by the system's original equipment manufacturers (OEMs), such as crack growth models.
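
The crack growth models mentioned above make a convenient concrete case. The sketch below is a minimal, hypothetical implementation of the classical Paris crack growth law, da/dN = C(ΔK)^m, of the general kind such an OEM progressive-damage model might take; the constants, geometry factor, and crack sizes are illustrative placeholders, not values from this specification.

```python
import math

def cycles_to_failure(a0, a_crit, delta_sigma, C, m, Y=1.0, da=1e-5):
    """Integrate the Paris law da/dN = C * delta_K**m from initial crack
    length a0 to critical length a_crit (meters), returning the estimated
    number of load cycles. delta_K = Y * delta_sigma * sqrt(pi * a)."""
    a, cycles = a0, 0.0
    while a < a_crit:
        delta_K = Y * delta_sigma * math.sqrt(math.pi * a)  # stress intensity range
        da_dN = C * delta_K ** m                            # crack growth per cycle
        cycles += da / da_dN                                # cycles to grow by da
        a += da
    return cycles

# Hypothetical aluminum-like constants (stress in MPa, lengths in m):
print(cycles_to_failure(a0=1e-3, a_crit=1e-2, delta_sigma=100.0, C=1e-11, m=3.0))
```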

Probabilistic analysis then determines the variation in the global system response as well as variation in the local system response. This probabilistic analysis also quantitatively assesses the importance of each of the random variables to the variation in the system response. This allows for development of a rational design framework for deciding which variables need to be controlled and how to increase the reliability of the system. The embodiment of the invention incorporating probabilistic analysis therefore provides for more accurate predictions of failure. Thus, this embodiment also provides a basis for more rational design decisions, while reducing expense and time to market.

FIG. 1 is a schematic illustrating an embodiment of an apparatus of the present invention employed on a dynamic system 22. System 22 in this illustrative embodiment is an automobile, with the embodiment described as a device in the automobile, but dynamic system 22 could be any dynamic system, such as a helicopter, airplane, automobile, rail car, tractor, or appliance. On-board Prognostic Instrument Engineer (OPIE) 10 generally includes a central processing unit (CPU) 18; a computer control 20; a user alert interface 26; and sensors 24. The CPU 18 receives input in the form of criteria, equations, models, and reference data 14 derived from engineering analysis performed at step 12, and the OPIE 10 uses such input to make a failure prediction at step 16.

Engineering analysis step 12 essentially comprises the preparatory steps that produce the criteria, equations, models, and reference data 14 that are used in failure prediction step 16 to assess the condition of the system or component of interest. Engineering analysis step 12 includes the steps: identify failure mechanisms 40; model failure mechanisms 42; formulate probabilistic strategy 46; and determine warning criteria 48. Engineering analysis step 12 yields criteria, equations, models, and reference data 14, which are further described and shown in FIG. 2(d).

Continuing with FIG. 1, criteria, equations, models, and reference data 14 are stored on a memory device 34 or incorporated into a computer program product within CPU 18 as a prediction analysis 30. Desired criteria from criteria, equations, models, and reference data 14 may also be programmed into overall system computer control 20.

Sensors 24 send information to computer control 20. Sensors 24 measure data on any number of conditions, such as temperature, speed, vibration, stress, noise, and the status and number of on/off cycles of various systems. Computer control 20 sends operation and sensor data 25 to CPU 18. Operation and sensor data 25 includes data from sensors 24 in addition to other data collected by computer control 20, such as ignition cycles, light status, mileage, speed, and numbers of activations of other sub-systems on system 22. CPU 18 creates input 28 by combining operation and sensor data 25 with information from memory device 34 and information from previous output data 32 that was stored in memory device 34.

CPU 18 analyzes input 28 as directed by prediction analysis 30 to produce the output data 32. Output data 32 contains a prediction result 29 and possibly other information. Output data 32 is then saved in memory device 34, while prediction result 29 is sent to computer control 20. Computer control 20 determines, from criteria contained in criteria, equations, models, and reference data 14, or from criteria developed separately, whether and how to signal user alert interface 26 based on prediction result 29. These criteria could be incorporated into CPU 18 instead, so that CPU 18 would determine whether to activate user alert interface 26.

User alert interface 26 is a number of individual components, with status or alert indicators for each as is necessary for the systems being analyzed for failure, such as, for example, a yellow light signal upon predicted failure exceeding a stated threshold value. A variety of user alert signal devices could be appropriate for the specific situation. Computer control 20 could also be configured to de-activate certain components upon receipt of the appropriate prediction result; e.g., vehicle ignition could be disabled should prediction result 29 indicate a brake failure.

FIGS. 2(a)-2(d) are flow charts depicting the operation of engineering analysis process step 12 (FIG. 2(a)) that results in creation of criteria, equations, models, and reference data 14 (FIG. 2(d)). In FIG. 2(a), engineering analysis step 12 begins by identifying failure mechanisms at step 40 through review of warranty and failure data (step 50) and research of literature (step 52) to determine which of the identified failure mechanisms are active failure mechanisms (step 54). This effort could incorporate discussions with component design staff. Determination of active failure mechanisms can include a variety of evaluations, discussions, and interpretations of both component and system response.

Failure mechanisms describe how and why the component fails. For example, mechanisms for delamination in a multi-layered material could include shear forces between the layers, adhesive decomposition, or manufacturing defects. Failure mechanisms are then modeled at step 42 by evaluating failure physics (step 56) while also evaluating the inter-relationships between models (step 66). Evaluating failure physics (step 56) requires identifying models from the designer or open literature (step 58), identifying the significant random variables (step 59), evaluating and selecting the appropriate models (step 60), and developing models for unique failure mechanisms (step 62) if no existing models are appropriate. Identifying the significant random variables (step 59) requires determining whether variation in a particular variable changes the outcome of the system. If so, then that variable is significant to some extent.

Inter-relationships between the selected models (step 66) are evaluated by literature review and designer interview (step 68), with the appropriate models tied together to simulate inter-relationships (step 70). Tying the models together as is appropriate to simulate inter-relationships (step 70) necessarily requires identifying inputs and outputs for each model (step 72) and developing a sequencing strategy (step 74). Identifying inputs and outputs for each model also facilitates developing the sequencing strategy (step 74).

FIGS. 2(a)-2(c) show how to formulate probabilistic strategy at step 46. Formulating probabilistic strategy is a method for predicting the probability of failure that considers the variability of the input and system parameters. Still referring to FIG. 2(a), the first step is to characterize variables (step 76). Variables are classified as those that can be directly sensed 78 or those that can be inferred 80 from directly sensed information. Otherwise, variable values must come from reference information 82. A part of characterizing variables (step 76) is also to identify the randomness of each variable, i.e., to determine the statistical variation of each variable.
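
A minimal sketch of this variable characterization, assuming each variable's randomness is summarized by a mean and standard deviation; the class and field names are illustrative, not terms from the specification. The example values are taken from Table I of Example 1 below.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Source(Enum):
    DIRECTLY_SENSED = auto()  # changes during operation; read from sensors (78)
    INFERRED = auto()         # changes during operation; derived from sensed data (80)
    REFERENCED = auto()       # does not significantly change; from reference information (82)

@dataclass
class RandomVariable:
    name: str
    source: Source
    mean: float       # nominal value
    std_dev: float    # statistical variation identified in step 76

p_max = RandomVariable("P, kips", Source.DIRECTLY_SENSED, 30.8, 3.08)
e11 = RandomVariable("E11, Msi", Source.INFERRED, 6.9, 0.09)
```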

Now referring to FIG. 2(b), formulation of the probabilistic approach at step 84 requires identifying and selecting an appropriate probabilistic technique. Two primary probabilistic approaches may be appropriate for prediction analysis 30 (FIG. 1): fast probability methods (FPM) or simulation techniques (ST). FPM include response surface FPM 88 and direct FPM 92 techniques. A response surface approximates the failure physics of the system with a single mathematical relationship. A direct method can use disjoint mathematical relationships and is simpler in form. ST likewise include response surface ST 90 and direct ST 94. (FPM and ST techniques are discussed further with reference to FIG. 2(c) below; see also A. Ang and W. Tang, Probability Concepts in Engineering Planning and Design, Vols. I and II, John Wiley & Sons, 1975.) Several factors must be considered during selection of the probabilistic strategy (step 46), including: CPU 18 computational capacity or limitations; whether it is possible to formulate a response surface equation; the mathematical form of the selected failure models (steps 60, 62) (FIG. 2(a)); the needed prediction accuracy; the characteristics of the monitored system; and the desired update speed or efficiency, among others. All factors are weighed in the balance by one of skill in the art, recognizing that engineering analysis 12 (FIG. 1) must determine which probabilistic technique is most appropriate for prediction analysis 30 (FIG. 1) for the particular type of system 22 (FIG. 1).

The system itself may dictate the approach. Of the primary probabilistic techniques available for prediction analysis 30, direct FPM 92 and ST 94 methods will always provide a solution to the system that facilitates prediction analysis 30. Response surface FPM 88 and ST 90, however, do not always provide a workable solution. For example, a response surface cannot be formed when considering variables that vary with time and present discontinuities. Direct methods are then necessary. Potentially, such a situation could be handled using multiple nested response surface equations, but a single response surface equation will not suffice. Where a response surface may be used, however, its use can increase the efficiency of the prediction calculations.

Referring to FIG. 2(c), FPM optional approaches include first order reliability methods (FORM), second order reliability methods (SORM), advanced mean value (AMV) methods, and mean value (MV) methods. ST optional approaches include Monte Carlo (MC) methods and importance sampling methods. These different methods are discussed in further detail in the Example below.

Response surface techniques, whether response surface FPM 88 or ST 90, are divided into capacity and demand segments (steps 112, 118, respectively). For response surface FPM 88, one of the approaches of FORM, SORM, AMV methods, or MV methods is used to produce a full cumulative distribution function (CDF) for the capacity portion of the response surface equation (step 114). A CDF is a plot describing the spread, or scatter, in the results obtained from only the capacity portion. For response surface ST 90, either MC or importance sampling methods are used to produce a full CDF for the capacity portion of the response surface equation (step 120). An equation is then fit to the CDF plots (steps 116, 122).

Often the capacity section is based on referenced data 82 (FIG. 2(a)), while the demand section is based on sensed data 78 and inferred data 80. In such a case, the equation from steps 116 and 122 produces a failure prediction for data representing referenced data 82, the capacity section of the response surface. Example 1, below, further illustrates this situation.

Direct techniques FPM 92 or ST 94 also have both capacity and demand designations, but no response surface is involved. Direct methods are therefore most often appropriate when a response surface cannot be created. The first step in direct FPM is to establish a method for generating random variables and calculating the corresponding random variable derivatives (step 124). The next step is to establish a scheme for using the random variable derivatives in a failure model (step 126). The failure model is the one developed in model failure physics (step 42) (FIGS. 1, 2(a)). The scheme established in step 126 serves to produce many random variable derivatives for input into the failure model from step 42 (FIGS. 1, 2(a)). Then one must determine the convergence criteria (step 128) to know when to cease inputting the random variable derivatives into the failure model.

Similarly, direct ST 94 uses the failure model from model failure physics (step 42). As with direct FPM, direct ST 94 must also create a random variable generation method (step 130), but direct ST 94 does not calculate derivatives of these random variables. The next step using direct ST 94 is to establish a method for using the random variables themselves in the failure model (step 132). The last step is to determine the number of simulations to be conducted (step 134), which sometimes requires trial and error to determine the number of simulations necessary to give a failure prediction with the desired precision.
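
One hedged alternative to pure trial and error in step 134 is the standard binomial error estimate for a Monte Carlo probability: the standard error of a POF estimate p from M simulations is sqrt(p(1-p)/M), which can be inverted to size M. The target precision below is chosen arbitrarily for illustration.

```python
import math

def simulations_needed(p_expected, target_std_error):
    """Smallest M such that the standard error sqrt(p(1-p)/M) of an
    estimated probability of failure is at most target_std_error;
    p_expected is a rough prior guess of the POF."""
    return math.ceil(p_expected * (1 - p_expected) / target_std_error ** 2)

# To resolve a POF near 1 percent with a standard error of 0.1 percent:
print(simulations_needed(0.01, 0.001))  # 9900 simulations
```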

Returning to FIG. 2(b), the step 46 of formulating probabilistic strategy continues with a determination of the analysis frequency (step 96), or the frequency with which prediction analysis 30 (FIG. 1) analyzes input 28 (FIG. 1). To determine analysis frequency (step 96), one must determine how often relevant directly sensed data is acquired and processed (step 98), determine the fastest update frequency required (step 100), and determine the appropriate analysis frequency (step 102) for prediction analysis 30 (FIG. 1).

The last step 48 in engineering analysis step 12 (FIG. 1) is to develop warning criteria (FIG. 1). Continuing with FIG. 2(b), determining warning criteria 48 requires establishing the reliability or probability of failure (POF) threshold for sending a warning (step 104) based on prediction analysis 30 (FIG. 1). The next step is to set the level of analysis confidence needed before a warning signal is to be sent (step 106) and then to develop a method for confidence verification prior to sending the warning (step 108). At some point, listed last here, one must determine a type of warning appropriate for the system or user (step 110).
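
A minimal sketch of warning criteria of the kind developed in steps 104-110, borrowing the thresholds used in Example 1 below (a 1 percent POF threshold, confidence-verified by two successive exceedances); the function and its arguments are hypothetical.

```python
def warning_needed(pof_history, threshold=0.01, confirmations=2):
    """True when the most recent `confirmations` POF values all exceed the
    threshold set at step 104, a simple confidence-verification rule of
    the kind developed at step 108."""
    if len(pof_history) < confirmations:
        return False
    return all(pof > threshold for pof in pof_history[-confirmations:])

print(warning_needed([0.002, 0.015]))         # False: one exceedance only
print(warning_needed([0.002, 0.015, 0.017]))  # True: two successive exceedances
```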

Now referring to FIG. 2(d), the results of the previous steps are programmed at step 136 into memory device 34 (FIG. 1) and CPU 18 (FIG. 1) as the appropriate criteria, equations, models, and reference data. For response surface FPM 88 or ST 90, the appropriate criteria, equations, models, and reference data 14 include: a mapping strategy for each variable and the response surface equation; a statistical distribution, or CDF, of the capacity portion of the response surface equation; and an analysis frequency strategy and warning criteria 138. The mapping strategy essentially relates sensed, inferred, and referenced data to the variable in the analysis that represents that data. For direct FPM 92, the appropriate criteria, equations, models, and reference data 14 include: a variable derivative method for FORM, SORM, AMV methods, or MV methods analysis; convergence criteria; and an analysis frequency strategy and warning criteria 140. And for direct ST 94, the appropriate criteria, equations, models, and reference data 14 include: a random variable generation method for MC or importance sampling analysis; a number of simulations to be conducted; and an analysis frequency strategy and warning criteria 142. One of ordinary skill in the art will know to mesh the invention with the system of interest in a way that allows both the invention and the system to operate correctly.

FIGS. 3(a) and 3(b) are flow charts that illustrate the operation of the failure prediction step 16 depicted schematically in FIG. 1. Referring to FIG. 3(a), the step of prediction analysis 30 (FIG. 1) on CPU 18 (FIG. 1) receives the equations from criteria, equations, models, and reference data 14. Failure prediction is performed by CPU 18 in response to operation and sensor data 25 received from computer control 20. CPU 18 reads or receives operation and sensor data 25 from control computer 20 according to the frequency strategy. Operation and sensor data 25 are combined with referenced data 82 (FIG. 2(a)) from memory 34 to create input 28. CPU 18 maps the data in input 28 to the appropriate variables for prediction analysis 30.

Continuing with FIG. 3(a), prediction analysis 30 follows different paths depending upon the technique chosen: probabilistic response surface FPM 88 or ST 90; probabilistic direct FPM 92; or probabilistic direct ST 94.

For direct FPM 92, POF is determined at step 152 using FORM, SORM, AMV methods, or MV methods as previously determined (see FIG. 2(d)). Then POF is compared at step 160 to exceedance criteria and verified per confidence criteria. Exceedance criteria for direct FPM 92 can be defined as the state when POF exceeds the established reliability or POF warning criteria threshold established at step 104 (FIG. 2(b)).

For direct ST 94, POF is determined at step 156 using MC or importance sampling methods as previously determined (see FIG. 2(d)). Then POF is compared at step 160 to exceedance criteria and verified per confidence criteria. Exceedance criteria can be defined as the state when POF exceeds the established warning criteria threshold value established at step 104. An example applicable to direct techniques 92 or 94 is where prediction analysis 30 determined POF at steps 152, 156 to be 1.2 percent, which was compared to the POF threshold 104 of 1.0 percent, thus establishing the need for a warning signal.

For response surface FPM 88 or ST 90, the demand portion of the response surface is calculated at step 146, and the POF is determined at step 148 using the CDF equation. POF is then compared at step 160 to exceedance criteria and verified per confidence criteria. Exceedance criteria can be defined as the state when the demand portion of the response surface exceeds the capacity portion of the response surface that is determined during engineering analysis step 12 (FIG. 1).

An example applicable to response surface FPM 88 or ST 90 is where the CDF is represented by the simple equation POF = (constant)*(demand). The demand portion of the response surface calculated at step 146 yields at step 148 a POF that is then compared to POF threshold 104. POF is then verified using the method for confidence verification 108 (FIG. 2(b)) with memory device 34 (FIG. 1). For these analysis methods, if POF as determined at steps 148, 152, 156 is compared and verified at step 160 and meets the exceedance criteria, then in step 162 the warning criteria are followed and a warning is included in output data 32.

Output data 32 includes the variable readings; POF; selected warning criteria; and warning information. For example, an output warning criterion could be to turn on a light when the calculated POF is greater than 1 percent. The demand variable readings, calculated values, POF, and selected warning criteria are stored at step 164 in memory device 34, and the appropriate warning information is communicated at step 166 as prediction results 29 to the vehicle computer control 20. Prediction results 29 may contain only a portion of the information in output data 32. The stored variable readings, POF, selected warning criteria, and warning information 164 serve as input for subsequent cycles.

Now referring to FIG. 3(b), at step 168 computer control 20 (FIG. 1) receives information from on-board sensors 24 and systems and sends the appropriate operation and sensor data (FIG. 1) to CPU 18 (FIG. 1), forming part of input 28 (FIG. 1). Operation and sensor data 25 includes data from sensors 24 in addition to other data collected by computer control 20, such as ignition cycles, brake light status, mileage, speed, and numbers of activations of other systems on dynamic system 22. Computer control 20 also collects at step 172 warning signal information as produced by CPU 18 and decides at step 174 if a signal should be sent to user alert interface 26. At step 176, user alert interface 26 receives the warning signal information from overall system computer control 20 and at step 178 activates alerts as appropriate. User alert interface 26 shows a number of individual components, with status or alert indicators for each as is necessary for the systems being analyzed for failure, such as, for example, yellow light 27.

FIG. 4 is a schematic illustrating an embodiment of an apparatus of the present invention employed on a static system 22 and an indication of the process flow. Prognostic Instrument Engineering System (PIES) 11 would be used where system 22 is a structure such as a bridge, or a moving structure such as an airplane, where the on-board information (from operation and sensor data 25) is used for prediction analysis 30 using a CPU 18 that is not on the system 22. PIES 11 generally includes a central processing unit (CPU) 18; a computer control 20; a user alert interface 26; and sensors 24. The CPU 18 receives input in the form of criteria, equations, models, and reference data 14 derived from engineering analysis performed at step 12, and the PIES 11 uses such input to make a failure prediction at step 16. PIES 11 is substantially similar to OPIE 10 (FIG. 1), a difference being that CPU 18 resides off-board and thus communication device 23 is needed to transmit data from sensors 24 to overall system computer control 20.

Engineering analysis step 12 essentially comprises the preparatory steps that produce the criteria, equations, models, and reference data 14 that are used in failure prediction step 16 to assess the condition of the system or component of interest. Engineering analysis step 12 includes the steps: identify failure mechanisms 40; model failure mechanisms 42; formulate probabilistic strategy 46; and determine warning criteria 48. Engineering analysis step 12 yields criteria, equations, models, and reference data 14, which were further described and shown in FIG. 2(d).

Continuing with FIG. 4, criteria, equations, models, and reference data 14 are stored on a memory device 34 or incorporated into a computer program product within CPU 18 as prediction analysis 30. Desired criteria from criteria, equations, models, and reference data 14 may also be programmed into overall system computer control 20.

Sensors 24 measure data on any number of conditions, such as temperature, speed, vibration, stress, noise, and the status and number of on/off cycles of various systems. Data acquired by sensors 24 are transmitted via communication device 23 (for example, hard wire, satellite, or cell phone systems) to computer control 20. Computer control 20 sends operation and sensor data 25 to CPU 18. Operation and sensor data 25 includes data from sensors 24 in addition to other data collected by computer control 20, such as weather conditions. CPU 18 creates input 28 by combining operation and sensor data 25 with information from memory device 34 and information from previous output data 32 that was stored in memory device 34.

CPU 18 analyzes input 28 as directed by prediction analysis 30 to produce the output data 32. Output data 32 contains a prediction result 29 and possibly other information. Output data 32 is then saved in memory device 34, while prediction result 29 is sent to computer control 20. Computer control 20 determines, from criteria contained in criteria, equations, models, and reference data 14, or from criteria developed separately, whether and how to signal user alert interface 27 based on prediction result 29. These criteria could be incorporated into CPU 18 instead, so that CPU 18 would determine whether to activate user alert interface 27.

User alert interface 27 is a number of individual components, with status or alert indicators for each as is necessary for the systems being analyzed for failure, such as, for example, a yellow light signal upon predicted failure exceeding a stated threshold value. A variety of user alert signal devices could be appropriate for the specific situation. Computer control 20 could also be configured to de-activate certain components upon receipt of the appropriate prediction result. For example, if a POF for a bridge structure exceeded the exceedance criteria, the State Department of Transportation might request that a team of engineers visually inspect the bridge. Another example might be that PIES has predicted increased POF due to continuous heat cycling that may have degraded solder connections within an electronic component. Here a signal would be sent to the overall system computer control 20, and a flash message would be sent as signal 38 to the system operator user alert interface 26.

The principles of the present invention are further illustrated by the following example. This example describes one possible preferred embodiment for illustrative purposes only. The example does not limit the scope of the invention as set forth in the appended claims.

EXAMPLE 1

The following example describes the modeling and prediction of failure in an exemplary embodiment according to the present invention.

FIGS. 5(a)-5(f) illustrate a preferred embodiment of the invention applied to a single dynamic component, namely a composite helicopter rotor hub. Reference numerals refer to the elements as they were discussed with respect to FIGS. 1-4. In this example, engineering analysis step 12 first incorporates a probabilistic approach using response surface FPM 88 techniques. Thereafter, the same example is used to demonstrate any difference that response surface ST 90, direct FPM 92, or direct ST 94 would have yielded.

A helicopter rotor hub is the structure to which the blades of the helicopter are attached. The rotor hub is a composite laminate structure, which means that it is manufactured by laying plies of composite sheets together and joining them (with an adhesive resin) to form an integral structure. Each composite sheet is called a ply. During flight, the rotor hub experiences continuous cyclic loading due to rotation of the helicopter blades, which causes structural fatigue failure. Upon inspection of failed hubs, it was determined that the initial cause was a cracking problem in the composite rotor hub. Thus, an identified failure mechanism was cracking in the rotor hub. FIG. 5(a) shows a one-half schematic finite element model (FEM) of the hub. Upon closer examination, it was observed that cracking was occurring at the laminate ply interfaces, as depicted in FIG. 5(b). After reviewing literature (failure reports in this case) and discussions with the part designer (step 52), the active failure mechanism was determined (step 54) to be cracking at the laminate ply interfaces, which was causing composite ply delamination. Thus, an identified failure mechanism from steps 40, 50, 52, and 54 generally illustrates how and why a part failed.

The next step was to model the failure mechanism 42. The first step in modeling was to evaluate the failure physics (step 56). Discussions with the part designer identified a model (step 58) used to model the failure of similar parts: the virtual crack closure technique (VCCT). VCCT was selected (step 60) to model the physics of delamination. VCCT was used to calculate the strain energy release rate (G) at the delamination (crack) tip. If the calculated strain energy release rate exceeded the critical strain energy release rate (G_crit) obtained from material tests, delamination failure was assumed to have occurred. VCCT calculates the strain energy release rate at the delamination tip such that:

G = G_I + G_II   Eq. (1)

where

G_I = −[F_ni(v_k − v_k′) + F_nj(v_m − v_m′)] / (2Δ)   Eq. (2)

G_II = −[F_ti(u_k − u_k′) + F_tj(u_m − u_m′)] / (2Δ)   Eq. (3)

In Eqs. 2 and 3, u and v are the tangential and perpendicular nodal displacements, respectively, and F_t and F_n are the tangential and perpendicular nodal forces, respectively. Delamination onset was assumed to occur when the calculated G exceeded the G_crit derived from material delamination tests. Since VCCT was adequate for modeling this failure mechanism, no unique model needed to be developed as in step 62.
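
A sketch of Eqs. (1)-(3) as a plain function, assuming the nodal forces and displacements have already been extracted from the FEM; argument names mirror the equation symbols (primed quantities end in `p`, and `delta` is the crack-tip element length Δ), and the values in the usage line are invented for illustration.

```python
def vcct_G(delta, F_ni, F_nj, F_ti, F_tj,
           v_k, v_kp, v_m, v_mp, u_k, u_kp, u_m, u_mp):
    """Strain energy release rate G = G_I + G_II at the delamination tip,
    per Eqs. (1)-(3)."""
    G_I = -(F_ni * (v_k - v_kp) + F_nj * (v_m - v_mp)) / (2.0 * delta)   # Eq. (2)
    G_II = -(F_ti * (u_k - u_kp) + F_tj * (u_m - u_mp)) / (2.0 * delta)  # Eq. (3)
    return G_I + G_II                                                    # Eq. (1)

# Illustrative (made-up) nodal forces and displacements:
print(vcct_G(0.01, -120.0, -80.0, -40.0, -25.0,
             1.0e-4, 0.2e-4, 0.8e-4, 0.1e-4,
             0.5e-4, 0.1e-4, 0.4e-4, 0.05e-4))
```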

In this case, seven significant random variables were identified at step 59 and are shown in Table I, where:

E₁₁, Msi: Longitudinal Young's modulus
E₂₂, Msi: Transverse Young's modulus
G₁₃, Msi: Shear modulus
ν₁₃: Poisson's ratio
P, kips: Tensile load
Φ, degrees: Bending angle
G_crit: Critical strain energy release rate
N: Fatigue cycle

TABLE I
The Significant Random Variables for the Response Surface FPM Example

Random Variable   Mean                      Std. Dev.
E₁₁, Msi          6.9                       0.09
E₂₂, Msi          1.83                      0.05
G₁₃, Msi          0.698                     0.015
ν₁₃               0.28                      0.01
P, kips           30.8                      3.08
Φ, degrees        12                        1.67
G_crit, J/m²      448.56 − 58.57 log_e(N)   36.6

Computation of the strain energy release rate, G, required determination of nodal forces and displacements at the delamination tip as shown in FIG. 5(b). Determination of the nodal forces and displacements required development of a finite element model (FEM) for the rotor hub with the appropriate loads and material properties of the hub.

Referring to FIG. 5(d), the physics of failure for the rotor hub required a combination of different models. Once the models were selected at step 60 or developed at step 62, the next step was to evaluate inter-relationships between models at step 66. This involved identifying the inputs and outputs of each model (step 72) as well as identifying inter-relationships from the literature and designer interviews (step 68). Then the models were tied together at step 70 and the overall model sequencing strategy developed at step 74. Since the material properties of the rotor hub were not readily available, they had to be derived from the material properties of the individual composite plies 180 using a laminate model 182 to give the laminate material properties 186. Laminate properties 186 and load data 184 were input into FEM 188 to yield nodal forces and displacements 190. Nodal forces and displacements 190 were input into VCCT 192 to yield the strain energy release rate (G) 194.
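
A sketch of that sequencing strategy, with stub functions standing in for the laminate model 182, FEM 188, and VCCT 192; every body below is a hypothetical placeholder meant only to show how each model's outputs feed the next model's inputs.

```python
def laminate_model(ply_properties):                 # element 182
    """Stub: derive laminate material properties 186 from ply data 180."""
    return dict(ply_properties)                     # a real model would homogenize the plies

def fem(laminate_properties, loads):                # element 188
    """Stub: a real FEM run would return nodal forces/displacements 190."""
    return {"F_n": -120.0, "F_t": -40.0, "dv": 8.0e-5, "du": 4.0e-5}

def vcct(nodal, delta=0.01):                        # element 192
    """Stub: reduced single-node form of Eqs. (2)-(3)."""
    return -(nodal["F_n"] * nodal["dv"] + nodal["F_t"] * nodal["du"]) / (2 * delta)

# Sequencing (step 74): ply data 180 -> laminate 182 -> FEM 188 -> VCCT 192 -> G 194.
plies = {"E11": 6.9, "E22": 1.83, "G13": 0.698, "nu13": 0.28}
loads = {"P": 30.8, "phi": 12.0}
print(vcct(fem(laminate_model(plies), loads)))
```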

FIG. 5(d) shows that the calculated strain energy release rate, G, was determined from the ply material properties 180 and the loads 184. G was the dependent variable, and the ply material properties and the loads were the independent variables. The next step was to develop a probabilistic strategy (step 46). First, all the variables were characterized in step 76 in terms of randomness and as directly sensed 78, inferred 80, or referenced 82. P_max and Φ are the directly sensed variables, and the material properties, including E₁₁, E₂₂, G₁₃, and ν₁₃, and G_crit are inferred variables whose randomness is presented in Table I. The part designer had gathered test data on G_crit versus the number of fatigue cycles (N). Based on the statistical analysis of this G_crit vs. N data (see FIG. 5(c)), it was determined that G_crit is a Gaussian (normal) random variable with the mean value and standard deviation shown in Table I. There are several different probabilistic assessment approaches (step 84) available. Direct ST 94 and FPM 92 and response surface ST 90 and FPM 88 techniques were discussed earlier, and this example will apply each to the rotor hub.

One such approach is to use the first order reliability method (FORM), which is an example of a fast probability method (FPM), in conjunction with the response surface, referred to previously as the response surface FPM 88 approach. First, a response surface must be developed relating G to the independent variables. Developing a response surface is widely discussed in the open literature. See A. Ang and W. Tang, Probability Concepts in Engineering Planning and Design, Vol. I, John Wiley & Sons, 1975. Based on the seven random variables, a Design of Experiments (DOE) scheme was chosen as shown in Table II.

TABLE II
Design of Experiments Scheme for the Response Surface FPM Approach

Variable   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5   Trial 6   Trial 7
E₁₁        1         0         0         0         0         0         0
E₂₂        0         1         0         0         0         0         0
G₁₃        0         0         1         0         0         0         0
ν₁₃        0         0         0         1         0         0         0
P          0         0         0         0         1         0         0
Φ          0         0         0         0         0         1         0
G_crit     0         0         0         0         0         0         1

Each trial yields the sensitivity of the strain energy release rate to the one variable perturbed in that trial (E₁₁ through G_crit, respectively).

In Table II, Trial 1, E₁₁ was changed from its nominal value (mean value, indicated as 1), while all the remaining six variables were kept at their respective mean values (indicated as 0), and the value of G 194 was calculated. This process was repeated for each of the six other variables. Following this step, a regression analysis was performed and an initial response surface was developed that related G to all seven significant random variables. After this, an Analysis of Variance (ANOVA) was performed to determine if all seven significant random variables needed to be included in the response surface. The ANOVA results showed that out of the seven random variables, only four (G_crit, E₁₁, P, and Φ) needed to be included in the response surface. Based on this, an updated DOE scheme was adopted, as shown in FIG. 5(e), to create a quadratic response surface equation. Regression analysis yielded the final response surface equation shown in Eq. (4). This strategy was verified by inputting data published in the open literature and comparing the output results with the corresponding published results.

g = G_crit − 175.344*(0.569 − 0.0861E₁₁ + 0.023P_max − 0.117Φ − 0.000546P²_max + 0.00376Φ² + 0.0046P_maxΦ)   Eq. (4)

The next step in the response surface FPM 88 (FIG. 2(b)) approach is to divide the response surface into the capacity and demand segments (step 112). The separation was as follows: Eq. (5) represents the capacity segment of Eq. (4), and Eq. (6) represents the demand segment of Eq. (4).

Capacity = G_crit − 175.344*(0.569 − 0.0861E₁₁)   Eq. (5)

Demand = 175.344*(0.023P_max − 0.117Φ − 0.000546P²_max + 0.00376Φ² + 0.0046P_maxΦ)   Eq. (6)

For this particular example, the variables in the capacity section of the response surface equation are the material property E₁₁ and G_crit. The variables in the demand portion of the response surface equation are the load (P) and the angle of the load (Φ). Eq. (5) was then used to produce a full CDF for the capacity portion of the response surface equation (step 114). This CDF is shown in FIG. 5(f) with capacity equated to the probability of failure.
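
A sketch of Eqs. (4)-(6) as functions, useful for checking that the split satisfies g = Capacity − Demand; only values appearing in the specification are used, and the mean-value check at the end assumes N = 1 fatigue cycle for G_crit.

```python
import math

def capacity(G_crit, E11):
    """Capacity segment of the response surface, Eq. (5)."""
    return G_crit - 175.344 * (0.569 - 0.0861 * E11)

def demand(P_max, phi):
    """Demand segment of the response surface, Eq. (6)."""
    return 175.344 * (0.023 * P_max - 0.117 * phi - 0.000546 * P_max**2
                      + 0.00376 * phi**2 + 0.0046 * P_max * phi)

def g(G_crit, E11, P_max, phi):
    """Full response surface, Eq. (4); failure is indicated by g <= 0."""
    return capacity(G_crit, E11) - demand(P_max, phi)

# Mean values from Table I (G_crit evaluated at N = 1, so log_e(N) = 0):
print(g(448.56 - 58.57 * math.log(1.0), 6.9, 30.8, 12.0))
```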

Using FORM, all the variables in the capacity portion of the response surface (E₁₁ and G_crit) are transformed to equivalent uncorrelated standard normal variables (Y₁ and Y₂). In the transformed uncorrelated standard normal space, a linear approximation is constructed to the capacity portion of the response surface and is given by the equation:

y = (9E−14)x⁶ − (9E−11)x⁵ + (3E−8)x⁴ − (5E−6)x³ + 0.0004x² − 0.0087x   Eq. (7)

To estimate the CDF using FORM, a constrained optimization scheme is adopted to search for the minimum distance from the origin to the transformed response surface. Mathematically, the problem can be formulated as:

Minimize β = ||Y|| such that g(Y) = 0   Eq. (8)

where β is the minimum distance and g(Y) is the transformed capacity portion of the response surface. Several optimization routines are available to solve the above constrained optimization problem. The method used in this example was formulated by Rackwitz and Fiessler. See Rackwitz, R. and Fiessler, B., Reliability Under Combined Random Load Sequences, Computers and Structures, Vol. 9, No. 5, pp. 489-494, 1978. A first order estimate of the failure probability is then computed as:

CDF = 1 − F(−β)   Eq. (9)

where F(−β) is the cumulative distribution function of a standard normal variable (i.e., a normal variable with zero mean value and unit standard deviation) evaluated at −β.
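
A minimal sketch of the FORM search of Eq. (8), using scipy's generic SLSQP optimizer in place of the Rackwitz-Fiessler algorithm the example used; the limit-state function here is an invented linear placeholder in standard normal space, since the example's transformed capacity surface (Eq. (7)) is specific to its data.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def g(Y):
    """Placeholder limit state in standard normal space; g(Y) = 0 defines
    the failure surface. Not the example's actual transformed surface."""
    return 3.0 - Y[0] - 0.5 * Y[1]

# Eq. (8): minimize beta = ||Y|| subject to g(Y) = 0 (the squared norm is
# used as the objective because it is smooth; the minimizer is the same).
res = minimize(lambda Y: Y @ Y, x0=np.array([1.0, 1.0]),
               constraints=[{"type": "eq", "fun": g}], method="SLSQP")
beta = np.linalg.norm(res.x)  # minimum distance to the failure surface

# First order estimates: POF = F(-beta) (Eq. (10)); CDF = 1 - F(-beta) (Eq. (9)).
print(beta, norm.cdf(-beta))
```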

A graph of the resultant CDF is shown in FIG. 5(f). Although mathematical expressions exist to determine the CDF, these expressions involve multiple integrals, which can be quite cumbersome to evaluate. Hence, to make the process of CDF computation faster and more tractable, an equation was fit to the CDF plot in step 116 using traditional curve fit methods. For this example, relevant directly sensed data was acquired and processed (step 98), and POF could be predicted every flight cycle. Sensor data was also collected continuously during flight, but it was decided that POF would only be reviewed after every two flight cycles (step 102). It was then determined in step 104 that a POF greater than 1 percent would trigger a warning "No-Go" signal that would in turn activate a yellow light within user alert interface 26. Also, it was decided that the method of confidence verification (step 108) was that, within the same flight cycle, a second POF would be determined based on updated sensor data. If prediction analysis 30 returned a POF greater than 1 percent two successive times within the same flight cycle, a warning would be sent per step 110. That warning would be the "No-Go" signal that activated yellow light 27. This completed engineering analysis step 12.

Criteria, equations, models, and reference data 14, consisting of the variable mapping strategy, response surface equation, statistical distribution of the capacity portion of the response surface equation, analysis frequency, and warning criteria, were programmed into memory 34 in step 138.

Failure prediction (step 16) began by sending sensor data on the two directly sensed variables (operation and sensor data 25), which for this example were P_max and Φ. The next step 146 was to compute the demand portion of the response surface equation. The result from the demand portion of the response surface (Eq. (6)) was then input into the CDF equation derived from the capacity portion of the response surface equation (Eq. (7)). Thus, the current POF at demand was computed in step 148 based on the directly sensed data. The POF was calculated after every second flight cycle based on P_max. For this example, the calculated demand (or sensed/inferred data) contribution from step 146 to the POF is shown in Table III for the selected cycle numbers after acquisition and analysis of the appropriate sensed and inferred data. In this example the capacity contribution is based on referenced data 82 and the demand contribution is based on sensed data 78 and inferred data 80, but as discussed earlier in the specification with reference to steps 112 and 118, the capacity and demand sections are not always based on the same data types.

TABLE III
Response Surface FPM Prediction Results

Cycle     P       Φ       Demand        POF    Warning
Cycle-1   30.8    12      68.52277995   0%     Go
Cycle-3   33.88   10.33   44.54782678   0%     Go
Cycle-5   27.72   13.67   86.06125102   0%     Go
Cycle-7   36.96   15.33   181.6366807   15%    Go
Cycle-7   37      16      201.9535041   17%    No-Go

Table III also shows the POF determined at step 148 that is compared to the exceedance criteria at step 160. When POF exceeded one percent twice consecutively within the same cycle (cycle 7), the warning criteria were followed in step 162, and a "No-Go" warning was issued as part of output data 32. Output data 32 included all the values from Table III. These were stored in step 164 in memory 34 of CPU 18. Thus memory 34 stored cycle data that served as input for subsequent cycles. Of the data in Table III, in this example, only the warning or lack of warning, "No-Go" or "Go," was sent in step 166 to the equivalent of control computer 20. After collecting the warning signal of "No-Go" in step 172, control computer 20 decided in step 174 that the warning required further communication to user alert interface 26. Upon receipt of the warning at step 176, user alert interface 26 activated a yellow cockpit indicator light and highlighted "check rotor hub" on a malfunction monitor.

A difference between the response surface FPM 88 and ST 90 approaches is the method used to create the CDF. The response surface ST approach used Monte Carlo (MC) methods to produce the CDF. Like the response surface FPM approach 88, the first step in the response surface ST 90 approach was to separate the response surface equation into capacity and demand portions at step 118. Following the division of the response surface, Monte Carlo simulation methods were used to develop the full CDF of the capacity portion of the response surface at step 120. For each MC simulation, random values of G_crit and E₁₁ were generated based on their respective statistical distribution types and respective statistical parameters. With each set of G_crit and E₁₁ values generated, the capacity portion of the response surface equation was computed. Following that, a histogram analysis was performed to develop the CDF curve for the capacity portion of the response surface equation.
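
A sketch of that Monte Carlo construction of the capacity CDF (step 120), sampling G_crit and E₁₁ from the normal distributions of Table I; the fatigue cycle count N used for G_crit's mean and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 100_000                       # number of MC simulations
N = 10                            # assumed fatigue cycle count for G_crit's mean

# Sample the capacity random variables per Table I.
G_crit = rng.normal(448.56 - 58.57 * np.log(N), 36.6, M)
E11 = rng.normal(6.9, 0.09, M)
capacity = G_crit - 175.344 * (0.569 - 0.0861 * E11)   # Eq. (5)

# Empirical CDF (the histogram analysis): fraction of samples <= each value.
x = np.sort(capacity)
cdf = np.arange(1, M + 1) / M
print(x[M // 2], cdf[M // 2])     # median capacity corresponds to CDF = 0.5
```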

Once the CDF curve fit was developed at step 122 for the capacity portion of the response surface equation, the failure prediction method followed the steps outlined above for the response surface FPM 88 approach in this embodiment of the invention. Table IV shows the results of estimating the probability of failure using the response surface ST 90 approach.

TABLE IV
Response Surface ST Prediction Results

Cycle     P       Φ       Demand        POF    Warning
Cycle-1   30.8    12      68.52277995   0%     Go
Cycle-3   33.88   10.33   44.54782678   0%     Go
Cycle-5   27.72   13.67   86.06125102   0%     Go
Cycle-7   36.96   15.33   181.6366807   15%    Go
Cycle-7   37      16      201.9535041   16%    No-Go

The direct FPM approach 92 does not require the development of a response surface to predict the probability of failure. This example used direct FORM to transform the seven random variables (in this example, these variables are the material properties, G_crit, P_max, and Φ) to equivalent uncorrelated standard normal variables (represented by vector Y). After transformation, a numerical differentiation scheme was employed at step 124 to determine the derivatives of the random variables. In the transformed uncorrelated standard normal space, a linear approximation was constructed to the final failure equation, which in this case is G > G_crit. The derivatives of the random variables were used at step 126 to determine the perturbed values of the random variables. To estimate the probability of failure using FORM, a constrained optimization scheme was adopted to search for the minimum distance from the origin to the transformed failure equation. Mathematically, the problem was formulated the same as Equation (8), where β was the minimum distance but g(Y) was the transformed failure equation. The Rackwitz and Fiessler optimization scheme was again used to solve the constrained optimization problem. See Rackwitz, R. and Fiessler, B., Reliability Under Combined Random Load Sequences, Computers and Structures, Vol. 9, No. 5, pp. 489-494, 1978. The constrained optimization scheme is an iterative process for estimating the probability of failure. A convergence criterion was determined at step 128 (FIG. 2(c)) to force the iterations to converge on a failure probability estimate. After the appropriate criteria, equations, models, and reference data were programmed at step 136 into memory device 34, a first order estimate of the POF was determined at step 152 using FORM as:

POF = F(−β)   Eq. (10)

where F(−β) was the CDF of a standard normal variable (i.e., a normal variable with zero mean value and unit standard deviation) evaluated at −β. Table V shows example results from estimating the probability of failure using the direct FPM approach.

TABLE V
Direct FPM Prediction Results

Cycle     P       Φ       POF    Warning
Cycle-1   30.8    12      0%     Go
Cycle-3   33.88   10.33   0%     Go
Cycle-5   27.72   13.67   0%     Go
Cycle-7   36.96   15.33   14%    Go
Cycle-7   37      16      14%    No-Go

Like the direct FPM approach, the direct ST 94 approach also does not require the development of a response surface. This example also used Monte Carlo (MC) methods within direct ST 94. The same seven significant variables from Table I were selected. Based on the analysis frequency, previously determined to be two flight cycles, once the sensors gathered the values of the directly sensed variables, values of the inferred variables were randomly generated in step 130 using MC methods; random values of G_crit and E₁₁ were generated based on their respective statistical distribution types and respective statistical parameters. For each set of directly sensed data, several sets of the inferred variables were generated. For each set of inferred variables generated, the value of the strain energy release rate G was computed in step 194 as shown in FIG. 5(d). The number of sets of inferred variables was based on the number of simulations to be conducted from step 134. Appropriate criteria, equations, models, and reference data were stored at step 136 in memory 34 of CPU 18.

For each simulation, if G > G_crit, a failure counter was incremented by one. For example, assume that for each set of P_max and Φ sensed, M sets of the inferred variables were generated, and that among those M sets, for n sets (n ≤ M), G was greater than G_crit. Then the probability of failure would be n/M. Table VI shows example results from estimating the probability of failure using the direct ST approach.

TABLE VI
Direct ST Prediction Results

Cycle     P       Φ       POF    Warning
Cycle-1   30.8    12      0%     Go
Cycle-3   33.88   10.33   0%     Go
Cycle-5   27.72   13.67   0%     Go
Cycle-7   36.96   15.33   16%    Go
Cycle-7   37      16      18%    No-Go
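
A sketch of that n/M failure count. A hypothetical compute_G stands in for the laminate/FEM/VCCT chain of FIG. 5(d); for convenience it reuses the Eq. (4) polynomial as a surrogate for G, whereas the real direct ST approach evaluates G through the physics models. The inferred variables follow Table I (G_crit at N = 1), and the sensed values are those of Cycle-1, for which Table VI reports a 0 percent POF.

```python
import numpy as np

def compute_G(P_max, phi, E11):
    """Hypothetical surrogate for the FIG. 5(d) model chain; the example
    computed G from the FEM and VCCT, not from this polynomial."""
    return 175.344 * (0.569 - 0.0861 * E11 + 0.023 * P_max - 0.117 * phi
                      - 0.000546 * P_max**2 + 0.00376 * phi**2
                      + 0.0046 * P_max * phi)

rng = np.random.default_rng(0)
M = 50_000                                  # simulations (step 134)
P_max, phi = 30.8, 12.0                     # directly sensed values, Cycle-1

# Step 130: generate inferred variables per Table I (G_crit mean at N = 1).
G_crit = rng.normal(448.56, 36.6, M)
E11 = rng.normal(6.9, 0.09, M)

n = np.count_nonzero(compute_G(P_max, phi, E11) > G_crit)  # failure counter
print("POF =", n / M)                       # n/M; effectively 0 here
```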

While the foregoing description and drawings represent embodiments of the present invention, it will be understood that various additions, modifications, and substitutions may be made therein without departing from the spirit and scope of the present invention as defined in the accompanying claims. In particular, it will be clear to those skilled in the art that the present invention may be embodied in other specific forms, structures, arrangements, and proportions, and with other elements, materials, and components, without departing from the spirit or essential characteristics thereof. The presently disclosed embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims and not limited to the foregoing description.

CLAIMS

1. A computer implemented method for predicting failure in a system, comprising: measuring data associated with a system; creating a prediction of a failure of said system using a probabilistic model and said data; and communicating said prediction.

2-24. (canceled)
25. An apparatus for predicting failure of a system, said apparatus comprising: sensors for acquiring data from a system; a first computer comprising: a processor; and a memory containing: instructions for measuring said data; instructions for creating a prediction of a failure of said system using a probabilistic model and said data; and instructions for communicating said prediction; and a communication device for communicating said prediction.

26-47. (canceled)
48. A computer program product for predicting failure of a system for use in conjunction with a computer system, said computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising: instructions for measuring data; instructions for storing said data; instructions for creating a prediction of failure of said system using a probabilistic model and said data; and instructions for communicating said prediction.

49-67. (canceled)